The Sampling Distributions and Covariance Matrix of Phylogenetic Spectra

Authors

  • Peter J. Waddell
  • David Penny
  • Michael D. Hendy
  • Greg Arnold
Abstract

We extend recent advances in computing variance-covariance matrices from genetic distances to a sequence method of phylogenetic analysis. These matrices, together with other statistical properties of corrected sequence spectra, are studied as a foundation for more powerful and testable methods in phylogenetics. We start with ŝ, a vector of the proportion of sites in a sequence of length c showing each of the possible character-state patterns for t taxa. Hadamard conjugations are then used to calculate γ̂, a vector of the support for bipartitions, or splits, in the data, after correcting for all implied multiple changes. These corrections are made independently of any tree and are illustrated with Cavender's two-character-state model. Each entry in γ̂ (γ̂₀ excluded) that is not associated with an edge on the tree that generated the data is an invariant (sensu Cavender) with an expected value of 0 as the number of sites c → ∞. Under an i.i.d. model (sites are independent and identically distributed), the vector ŝ is a random sample from a scaled multinomial distribution. Starting from this point, we illustrate the derivation of V[γ̂], the variance-covariance matrix of γ̂. The bias induced by the delta method, a convenient approximation in deriving V[γ̂], is evaluated for both population and sample variance-covariance matrices. It is found to be acceptable in the first case and very good in the second. Likewise, bias in γ̂ due to a logarithmic transform and to short sequences is also acceptable. We infer the marginal distributions of entries in γ̂. Simulations with illustrative values of c and λ (the rate per site) show how γ̂ tends to multivariate normal as c → ∞. Our results extend naturally to four-color (nucleotide) spectra.
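
The pipeline summarized in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. It assumes the commonly cited two-state form of the Hadamard conjugation, in which the split-support vector is H⁻¹ ln(H s) for an observed spectrum s of site-pattern proportions, with H a Sylvester-ordered Hadamard matrix. It also assumes the usual scaled multinomial covariance (diag(s) − s sᵀ)/c and a first-order delta-method Jacobian H⁻¹ diag(1/(H s)) H. All function names are hypothetical.

```python
import math

def hadamard_transform(v):
    """Return H v, with H the Sylvester Hadamard matrix (fast Walsh-Hadamard transform)."""
    out = [float(x) for x in v]
    n = len(out)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a, b = out[j], out[j + h]
                out[j], out[j + h] = a + b, a - b
        h *= 2
    return out

def spectrum_from_gamma(gamma):
    """Forward conjugation (assumed form): s = H^-1 exp(H gamma), with H^-1 = H / n."""
    n = len(gamma)
    rho = hadamard_transform(gamma)
    return [x / n for x in hadamard_transform([math.exp(x) for x in rho])]

def gamma_from_spectrum(s):
    """Inverse conjugation (assumed form): gamma = H^-1 ln(H s)."""
    n = len(s)
    r = hadamard_transform(s)
    if min(r) <= 0:
        raise ValueError("H s has a non-positive entry; the log correction is undefined")
    return [x / n for x in hadamard_transform([math.log(x) for x in r])]

def multinomial_cov(s, c):
    """Covariance of pattern proportions from c i.i.d. sites: (diag(s) - s s^T) / c."""
    n = len(s)
    return [[((s[i] if i == j else 0.0) - s[i] * s[j]) / c for j in range(n)]
            for i in range(n)]

def delta_method_cov(s, c):
    """First-order (delta-method) approximation J V[s] J^T, J = H^-1 diag(1/(H s)) H."""
    n = len(s)
    r = hadamard_transform(s)
    cols = []
    for k in range(n):
        # Column k of J: apply H, scale by 1/r, apply H again, divide by n.
        e = [1.0 if i == k else 0.0 for i in range(n)]
        he = hadamard_transform(e)
        cols.append([x / n for x in hadamard_transform([he[i] / r[i] for i in range(n)])])
    J = [[cols[k][i] for k in range(n)] for i in range(n)]
    S = multinomial_cov(s, c)
    JS = [[sum(J[i][k] * S[k][j] for k in range(n)) for j in range(n)] for i in range(n)]
    return [[sum(JS[i][k] * J[j][k] for k in range(n)) for j in range(n)] for i in range(n)]

# Illustrative values only: three taxa (four patterns), entry 0 set to minus the sum of the rest.
gamma = [-0.17, 0.05, 0.10, 0.02]
s = spectrum_from_gamma(gamma)      # expected pattern proportions
V = delta_method_cov(s, c=1000)     # approximate covariance of the split supports
```

For two taxa the inverse conjugation reduces to the familiar two-state distance correction −½ ln(1 − 2p), where p is the proportion of differing sites, which is a quick sanity check on the assumed formula. Because the split supports sum to zero whenever the spectrum sums to one, each row of the approximate covariance matrix also sums to zero.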

Related resources

EIGENVECTORS OF COVARIANCE MATRIX FOR OPTIMAL DESIGN OF STEEL FRAMES

In this paper, the discrete method of eigenvectors of covariance matrix is used for weight minimization of steel frame structures. The Eigenvectors of Covariance Matrix (ECM) algorithm is a robust, iterative method for solving optimization problems and is inspired by the CMA-ES method. Both methods use a covariance matrix in the optimization process, but the covariance matrix calcula...

Information and Covariance Matrices for Multivariate Pareto (IV), Burr, and Related Distributions

The main result of this paper is the derivation of exact analytical expressions for the information and covariance matrices of multivariate Pareto, Burr, and related distributions. These distributions arise as tractable parametric models in reliability, actuarial science, economics, finance, and telecommunications. We show that all the calculations can be obtained from one main moment multidimensional integr...

Determination of Maximum Bayesian Entropy Probability Distribution

In this paper, we consider methods for determining maximum entropy multivariate distributions with a given prior, under the constraint that either the marginal distributions or the marginals and covariance matrix are prescribed. Next, some numerical solutions are considered for cases in which no closed-form solution is available. Finally, these methods are illustrated via numerical examples.

Accounting for Sampling Error in Genetic Eigenvalues Using Random Matrix Theory.

The distribution of genetic variance in multivariate phenotypes is characterized by the empirical spectral distribution of the eigenvalues of the genetic covariance matrix. Empirical estimates of genetic eigenvalues from random effects linear models are known to be overdispersed by sampling error, where large eigenvalues are biased upward, and small eigenvalues are biased downward. The overdisp...

An effect of initial distribution covariance for annealing Gaussian restricted Boltzmann machines

In this paper, we investigate an effect that the covariance of an initial distribution for annealed importance sampling (AIS) exerts on the estimation accuracy for the partition functions of Gaussian restricted Boltzmann machines (RBMs). A common choice for an AIS initial distribution is a Gaussian RBM (GRBM) with zero weight connections. Such an initial distribution does not show any covarianc...

Publication date: 1998